148 PART 4 Comparing Groups
Executing a t test
Statistical software packages contain commands that can execute (or run) t tests
(see Chapter 4 for more about these packages). The examples presented here use
R, and in this section, we explain the data structure required for running the
various t tests in R. For demonstration, we use data from the 2017–2020 National
Health and Nutrition Examination Survey (NHANES) file (available at
wwwn.cdc.gov/nchs/nhanes/continuousnhanes/default.aspx?Cycle=2017-2020).
» For the one-group t test, you need the column of data containing the
variable whose mean you want to compare to the hypothesized value (H), and
you need to know H. R and other software let you specify a value for H
and assume 0 if you don’t specify one. In the NHANES data, the fasting
glucose variable is LBXGLU, so the R code to test the mean fasting glucose
against a maximum healthy level of 100 mg/dL in an R dataframe named
GLUCOSE is t.test(GLUCOSE$LBXGLU, mu = 100).
» For the paired t test, you need two columns of data holding the two
measurements that make up each pair. For example, in NHANES,
systolic blood pressure (SBP) was measured twice in the same participant
(variables BPXOSY1 and BPXOSY2). To compare these with a paired t test in an
R dataframe named BP, the code is
t.test(BP$BPXOSY1, BP$BPXOSY2, paired = TRUE).
» For the independent t test, you need to have one column coded as the
grouping variable (preferably with a two-state flag coded as 0 and 1), and
another column with the value you want to test. We created a two-state flag in
the NHANES data called MARRIED where 1 = married and 0 = all other marital
statuses. To compare mean fasting glucose level between these two groups in
a dataframe named NHANES, we used this code: t.test(NHANES$LBXGLU ~
NHANES$MARRIED).
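The three calls described in the list above can be tried end to end with a minimal sketch like the following. It assumes simulated stand-in data rather than the real NHANES file: the dataframe and column names match the text, but every value is generated with rnorm() and rbinom() purely for illustration. One point worth knowing is that R's t.test() runs the Welch (unequal variance) version of the independent t test by default; adding var.equal = TRUE requests the equal-variance version.

```r
# Simulated stand-ins for the NHANES columns named in the text.
# ASSUMPTION: these values are made up for illustration; they are NOT NHANES data.
set.seed(11)
n <- 200

# One-group t test: mean fasting glucose against H = 100 mg/dL
GLUCOSE <- data.frame(LBXGLU = rnorm(n, mean = 105, sd = 15))
one_group <- t.test(GLUCOSE$LBXGLU, mu = 100)

# Paired t test: two systolic readings taken on the same participants
first_reading <- rnorm(n, mean = 122, sd = 14)
BP <- data.frame(
  BPXOSY1 = first_reading,
  BPXOSY2 = first_reading + rnorm(n, mean = -1, sd = 5)
)
paired <- t.test(BP$BPXOSY1, BP$BPXOSY2, paired = TRUE)

# Independent t test: fasting glucose by a 0/1 grouping flag
NHANES <- data.frame(
  LBXGLU  = rnorm(n, mean = 105, sd = 15),
  MARRIED = rbinom(n, size = 1, prob = 0.5)
)
welch  <- t.test(NHANES$LBXGLU ~ NHANES$MARRIED)                    # default: Welch
pooled <- t.test(NHANES$LBXGLU ~ NHANES$MARRIED, var.equal = TRUE)  # equal variance

# Each result is a list; printing it shows t, df, the p value, and a confidence interval
print(one_group)
print(paired)
print(welch)
print(pooled)
```

With real data, you would replace the simulated dataframes with the ones you imported and leave the t.test() calls unchanged.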
TABLE 11-1 How t Tests Calculate Difference, Standard Error, and Degrees of Freedom

|    | One-Group | Paired | Unpaired t, Equal Variance | Welch t, Unequal Variance |
|----|-----------|--------|----------------------------|---------------------------|
| D  | Difference between mean of observations and a hypothesized value (h) | Mean of paired differences | Difference between means of the two groups | Difference between means of the two groups |
| SE | SE of the observations | SE of paired differences | SE of difference, based on a pooled estimate of SD within each group | SE of difference, from SE of each mean, by propagation of errors |
| df | Number of observations – 1 | Number of pairs – 1 | Total number of observations – 2 | “Effective” df, based on the size and SD of the two groups |